NERD: A Framework for Evaluating Named Entity Recognition Tools in the Web of Data
Abstract
In this paper, we present NERD, an evaluation framework that records and analyzes human ratings of Named Entity (NE) extraction and disambiguation tools applied to English plain-text articles. NERD enables the comparison of popular Linked Data entity extractors that expose APIs, such as AlchemyAPI, DBpedia Spotlight, Extractiv, OpenCalais and Zemanta. Given an article and a particular tool, a user can assess the precision of the extracted named entities, their typing, the Linked Data URI provided for disambiguation, and their subjective relevance to the text. All user interactions are stored in a database. We propose the NERD ontology, which defines mappings between the types detected by the different NE extractors. The NERD framework then makes it possible to visualize the comparative performance of these tools with respect to human assessment.
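As a rough illustration of the kind of aggregation the abstract describes, the sketch below turns stored human judgments into per-tool precision for the entity, its type, and its URI. It is not NERD's implementation: the `Rating` schema, the `precision_by_tool` function, the placeholder tool names, and the sample judgments are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One human judgment of one extracted entity (hypothetical schema)."""
    tool: str        # extractor that produced the entity
    entity_ok: bool  # was the extracted span judged a valid named entity?
    type_ok: bool    # was the assigned type judged correct?
    uri_ok: bool     # was the disambiguation URI judged correct?


def precision_by_tool(ratings):
    """Return {tool: {aspect: fraction of ratings judged correct}}."""
    counts = defaultdict(lambda: {"n": 0, "entity": 0, "type": 0, "uri": 0})
    for r in ratings:
        c = counts[r.tool]
        c["n"] += 1
        c["entity"] += r.entity_ok  # True counts as 1, False as 0
        c["type"] += r.type_ok
        c["uri"] += r.uri_ok
    return {
        tool: {k: c[k] / c["n"] for k in ("entity", "type", "uri")}
        for tool, c in counts.items()
    }


# Invented judgments for two placeholder tools; not real evaluation data.
ratings = [
    Rating("tool_a", True, True, False),
    Rating("tool_a", True, False, False),
    Rating("tool_b", True, True, True),
    Rating("tool_b", False, False, False),
]
scores = precision_by_tool(ratings)
```

Keeping the three aspects separate mirrors the abstract's distinction between judging the entity itself, its type, and its URI; a relevance score could be added as a fourth field in the same way.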
Similar resources
NERD: Evaluating Named Entity Recognition Tools in the Web of Data
The Web of data promotes the idea that more and more data are interconnected. A step towards this goal is to bring more structured annotations to existing documents using common vocabularies or ontologies. Semi-structured texts such as scientific, medical or news articles as well as forum and archived mailing list threads or (micro-)blog posts can hence be semantically annotated. Named Entity (...
NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Extraction Tools
Named Entity Extraction is a mature task in the NLP field that has yielded numerous services gaining popularity in the Semantic Web community for extracting knowledge from web documents. These services are generally organized as pipelines, using dedicated APIs and different taxonomy for extracting, classifying and disambiguating named entities. Integrating one of these services in a particular ...
Crowdsourced Entity Markup
Entities, such as people, places, products, etc., exist in knowledge bases and linked data, on one hand, and in web pages, news articles, and social media, on the other hand. Entity markup, like Named Entities Recognition and Disambiguation (NERD), is the essential means for adding semantic value to unstructured web contents and this way enabling the linkage between unstructured and structured ...
Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web
Named entity recognition and disambiguation are of primary importance for extracting information and for populating knowledge bases. Detecting and classifying named entities has traditionally been taken on by the natural language processing community, whilst linking of entities to external resources, such as those in DBpedia, has been tackled by the Semantic Web community. As these tasks are tr...
Publication date: 2011